Preparations

Load the necessary libraries

library(car) # for regression diagnostics
library(broom) # for tidy output
library(ggfortify) # for model diagnostics
library(sjPlot) # for outputs
library(knitr) # for kable
library(effects) # for partial effects plots
library(ggeffects) # for partial effects plots
library(emmeans) # for estimating marginal means
library(MASS) # for glm.nb
library(MuMIn) # for AICc
library(tidyverse) # for data wrangling
library(broom.mixed) # for tidy output from mixed models
library(nlme) # for lme
library(lme4) # for lmer
library(lmerTest) # for Satterthwaite's p-values
library(glmmTMB) # for glmmTMB
library(DHARMa) # for residuals and diagnostics
library(performance) # for diagnostic plots
library(see) # for diagnostic plots

Scenario

Starlings

Sampling design

Format of the starling_full.csv data file

SITUATION  MONTH  MASS  BIRD
tree       Nov    78    tree1
..         ..     ..    ..
nest-box   Nov    78    nest-box1
..         ..     ..    ..
inside     Nov    79    inside1
..         ..     ..    ..
other      Nov    77    other1
..         ..     ..    ..
tree       Jan    85    tree1
..         ..     ..    ..
SITUATION: Categorical listing of roosting situations (tree, nest-box, inside or other).
MONTH: Categorical listing of the month of sampling (Nov or Jan).
MASS: Mass (g) of starlings.
BIRD: Categorical identifier of the individual bird, each of which was sampled repeatedly.

This is a split-plot (or repeated measures) design. The individual birds are the blocks, Situation is the between-block effect and Month is the within-block effect. Repeated measures analyses involve a within-block effect that represents time (in this case, Month). Because the order of time cannot be randomised, the residuals in repeated measures designs can be auto-correlated. That is, rather than being independent, residuals from observations that are close together in time tend to be more similar (correlated) than residuals from observations that are further apart in time.

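That said, with only two time points there is only a single time lag, so a temporal auto-correlation structure cannot be distinguished from the constant within-bird correlation that a random intercept for bird already implies.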
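
As a sketch of why two time points leave no room for a separate auto-correlation structure, suppose mass is modelled with a random intercept for each bird. The symbols \(\sigma_b^2\) (between-bird variance), \(\sigma^2\) (residual variance), \(\rho\) and \(\phi\) are introduced here for illustration only. The two observations on a single bird then have covariance matrix

\[ \mathrm{Cov}\begin{pmatrix} y_{\text{Nov}} \\ y_{\text{Jan}} \end{pmatrix} = \begin{pmatrix} \sigma_b^2 + \sigma^2 & \sigma_b^2 \\ \sigma_b^2 & \sigma_b^2 + \sigma^2 \end{pmatrix}, \qquad \rho = \frac{\sigma_b^2}{\sigma_b^2 + \sigma^2} \]

A first-order autoregressive structure with parameter \(\phi\) gives a correlation of \(\phi^{|t-s|}\) between times \(t\) and \(s\). With only two time points, its correlation matrix reduces to

\[ \begin{pmatrix} 1 & \phi \\ \phi & 1 \end{pmatrix} \]

which has the same single-parameter form as the random-intercept correlation above, so the data cannot separate a decay over time from one shared within-bird correlation.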

Read in the data

starling <- read_csv("../public/data/starling_full.csv", trim_ws = TRUE)
glimpse(starling)
## Rows: 80
## Columns: 5
## $ MONTH      <chr> "Nov", "Nov", "Nov", "Nov", "Nov", "Nov", "Nov", "Nov", "No…
## $ SITUATION  <chr> "tree", "tree", "tree", "tree", "tree", "tree", "tree", "tr…
## $ subjectnum <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1…
## $ BIRD       <chr> "tree1", "tree2", "tree3", "tree4", "tree5", "tree6", "tree…
## $ MASS       <dbl> 78, 88, 87, 88, 83, 82, 81, 80, 80, 89, 78, 78, 85, 81, 78,…

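Let's prepare the data by declaring the categorical variables as factors, with MONTH ordered chronologically (Nov before Jan):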

starling <- starling %>% mutate(
  MONTH = factor(MONTH, levels = c("Nov", "Jan")),
  SITUATION = factor(SITUATION),
  BIRD = factor(BIRD)
)
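
Before modelling, it can be worth confirming that the data really have the split-plot structure described earlier: each bird measured once per month, and each bird belonging to a single roosting situation. The following is a minimal base R sketch of such a check. It runs on a small made-up data frame (`toy`, with invented rows but the same column names as the starling data) so that it is self-contained; the object names `toy`, `obs_per_cell`, `all_once`, `situations_per_bird` and `one_situation` are illustrative assumptions, not part of the original analysis.

```r
# Toy data with the same column names as the starling data (rows are made up)
toy <- expand.grid(
  subjectnum = 1:2,
  SITUATION = c("tree", "inside"),
  MONTH = c("Nov", "Jan"),
  stringsAsFactors = FALSE
)
toy$BIRD <- paste0(toy$SITUATION, toy$subjectnum)
toy$MONTH <- factor(toy$MONTH, levels = c("Nov", "Jan"))

# Within-block factor: every bird should be measured exactly once in each month
obs_per_cell <- table(toy$BIRD, toy$MONTH)
all_once <- all(obs_per_cell == 1)

# Between-block factor: every bird should belong to exactly one situation
situations_per_bird <- tapply(
  toy$SITUATION, toy$BIRD,
  function(x) length(unique(x))
)
one_situation <- all(situations_per_bird == 1)

obs_per_cell
c(all_once = all_once, one_situation = one_situation)
```

To apply the same check to the real data, replace `toy` with `starling` after the factor conversion above.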

Exploratory data analysis

Model formula: \[ \begin{aligned} y_i &\sim \mathcal{N}(\mu_i, \sigma^2)\\ \mu_i &= \boldsymbol{\beta} \mathbf{X}_i + \boldsymbol{\gamma} \mathbf{Z}_i \end{aligned} \]

where \(\boldsymbol{\beta}\) and \(\boldsymbol{\gamma}\) are vectors of the fixed and random effects parameters, respectively, \(\mathbf{X}\) is the model matrix representing the overall intercept and the effects of roosting situation and month on starling mass, and \(\mathbf{Z}\) is a cell means model matrix for the random intercepts associated with individual birds.
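
To make the two model matrices concrete, the sketch below builds them with base R's `model.matrix()` on a small made-up data frame (`toy`, which is not the real data). It assumes the fixed effects include the situation by month interaction, which the text above does not state explicitly. In `lme4` syntax, the corresponding model would be written along the lines of `MASS ~ SITUATION * MONTH + (1 | BIRD)`.

```r
# Toy data (made up): 2 situations x 2 birds each x 2 months = 8 rows
toy <- expand.grid(
  subjectnum = 1:2,
  SITUATION = c("inside", "tree"),
  MONTH = c("Nov", "Jan")
)
toy$BIRD <- factor(paste0(toy$SITUATION, toy$subjectnum))

# X: fixed effects model matrix (intercept, situation, month and,
# as an assumption, their interaction) under treatment contrasts
X <- model.matrix(~ SITUATION * MONTH, data = toy)

# Z: cell means model matrix for the bird random intercepts;
# removing the intercept gives one indicator column per bird
Z <- model.matrix(~ 0 + BIRD, data = toy)

dim(X)
colnames(X)
dim(Z)
rowSums(Z) # each observation belongs to exactly one bird
```

Each row of `Z` contains a single 1 in the column of the bird that the observation came from, which is what allows \(\boldsymbol{\gamma}\) to hold one random intercept per bird.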

Fit the model

Model validation

Partial plots

Model investigation / hypothesis testing

Further analyses

Summary figures

References